- Title
- Processing speech for dietary assessment in LLMICs
- Creator
- Dodd, Connor Thomas
- Relation
- University of Newcastle Research Higher Degree Thesis
- Resource Type
- thesis
- Date
- 2025
- Description
- Research Doctorate - Doctor of Philosophy (PhD)
- Description
- Dietary assessment is required to inform individual and population health; however, traditional methods of capturing dietary intake can be inaccessible due to high complexity and resource costs. Self-reporting methods reduce the need for resources and research infrastructure, but human bias introduces systematic error that reduces the validity of results. Therefore, recent approaches have leveraged human-computer interaction facilitated by emerging technologies to reduce the burdens placed on respondents and thus reduce bias and enhance accessibility of this crucial information. The use of speech to capture spoken descriptions of foods is one such approach. Speech was introduced as a less complex alternative to written records, capable of engaging populations with varying literacy, and technological developments benefiting the collection and analysis of natural speech have since spurred further research. Literature on the application of these technologies is promising but limited in scope; thus, the overarching goal of this research is to expand this body of knowledge by application to a real-world scenario, specifically in a low-income context most impacted by resource and infrastructure constraints. To position this research, we engaged in a systematic literature review to synthesise existing literature against a conceptual framework to facilitate comparison of 21 articles across the nutrition and computing domains. Reviewed literature found speech to aid ease-of-use and reduce burdens related to capture, supporting its use for dietary assessment. Smartphones provided a convenient and approachable platform for efficient self-reporting through speech capture; however, processing of the resultant data constituted a considerable burden on research infrastructure and staff. Emerging speech-transcription and natural-language-processing technologies were utilised in a system to reduce that burden through automated understanding of unstructured speech descriptions of dietary intake.
Our review presented the current methods used, and illustrated a conceptual model of such a system. The methods evaluated in reviewed papers utilising recent technologies to automate processing of dietary records were promising but had yet to be evaluated on real data captured for the purposes of dietary assessment. They were also conducted in high-income, high-resource countries. To evaluate their capacity to aid low-income countries, we conducted a study applying current methods to free-living dietary records captured using speech in Cambodia. We found that this context necessitated a further step: translating the transcribed data to a language with more available resources. We also identified several challenges impacting the viability of automated processing on our collected data and proposed an alternative system for semi-automatically assisting human dietitians. Images are another medium used to capture dietary food records, with a similar rationale to speech. Dual capture of both image and speech descriptions was used to capture complementary information in reviewed studies, addressing some of the identified challenges with speech data alone. Incorporating the semi-automated approach suggested in the previous study, our next study focused on the design and evaluation of a content management system to assist in manual processing of combined image-voice food records. Building on a co-design framework recommended for health technologies, we collaborated with expert stakeholders from the computing and nutrition domains to iteratively develop our system. The identified challenges, and the process of generatively forming (and evaluating) solutions to them, were presented. Finally, we synthesised and reported the design principles inherent in the resultant system.
- Subject
- natural language processing; semi-automation; system design; speech-to-text; dietary assessment; food recording
- Identifier
- http://hdl.handle.net/1959.13/1520227
- Identifier
- uon:57449
- Rights
- Copyright 2025 Connor Thomas Dodd
- Language
- eng
- Full Text
File | Description | Size | Format
---|---|---|---
ATTACHMENT01 | Thesis | 2 MB | Adobe Acrobat PDF
ATTACHMENT02 | Abstract | 462 KB | Adobe Acrobat PDF